Constructing Common Factors from Continuous and Categorical Data
Abstract
The method of principal components is widely used to estimate common factors in large panels of continuous data. This paper first reviews alternative methods that obtain the common factors by solving a Procrustes problem. While these matrix decomposition methods do not specify the probabilistic structure of the data and hence do not permit statistical evaluations of the estimates, they can be extended to analyze categorical data. This involves the additional step of quantifying the ordinal and nominal variables. The paper then reviews and explores the numerical properties of these methods. An interesting finding is that the factor space can be quite precisely estimated directly from categorical data without quantification. This may require using a larger number of estimated factors to compensate for the information loss in categorical variables. Separate treatment of categorical and continuous variables may not be necessary if structural interpretation of the factors is not required, such as in forecasting exercises.
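The claim that the factor space can be recovered directly from categorical data can be illustrated with a small simulation. The sketch below is not the paper's procedure: the data-generating process, the panel dimensions, the tercile cut-offs, and the helper names `pc_factors`, `to_ordinal`, and `span_r2` are all assumptions made for illustration. It estimates factors by principal components from a continuous panel, then from an ordinal version of the same panel whose integer codes are used as is (no quantification step), and compares how well each set of estimates spans the true factors.

```python
import numpy as np


def pc_factors(X, k):
    """Estimate k factors by principal components from a T x N panel."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize each series
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return np.sqrt(Z.shape[0]) * U[:, :k]         # normalization F'F / T = I


def to_ordinal(X, levels=3):
    """Cut each column at its empirical quantiles into codes 0..levels-1."""
    probs = np.linspace(0, 1, levels + 1)[1:-1]
    cuts = np.quantile(X, probs, axis=0)          # shape (levels-1, N)
    return (X[None, :, :] > cuts[:, None, :]).sum(axis=0).astype(float)


def span_r2(F, F_hat):
    """Average R^2 from regressing each true factor on the estimated ones."""
    A = np.column_stack([np.ones(len(F_hat)), F_hat])
    coef, *_ = np.linalg.lstsq(A, F, rcond=None)
    resid = F - A @ coef
    return float(np.mean(1.0 - resid.var(axis=0) / F.var(axis=0)))


# Hypothetical design: T periods, N series, r true factors, unit-variance noise.
rng = np.random.default_rng(0)
T, N, r = 300, 100, 2
F = rng.standard_normal((T, r))
Lam = rng.standard_normal((N, r))
X = F @ Lam.T + rng.standard_normal((T, N))
X_cat = to_ordinal(X, levels=3)                   # three ordered categories

r2_cont = span_r2(F, pc_factors(X, r))
r2_cat = span_r2(F, pc_factors(X_cat, r))
r2_cat_more = span_r2(F, pc_factors(X_cat, r + 2))

print(f"continuous data, {r} factors:      R2 = {r2_cont:.3f}")
print(f"categorical data, {r} factors:     R2 = {r2_cat:.3f}")
print(f"categorical data, {r + 2} factors: R2 = {r2_cat_more:.3f}")
```

Because the first `r` estimated factors are a subset of the first `r + 2`, the in-sample fit with extra factors cannot be lower, which mirrors the abstract's remark that using more estimated factors may compensate for the information lost in categorization.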
Related resources
Correct Looping Arrows from Cyclic Terms - Traced Categorical Interpretation in Haskell
Arrows involving a loop operator provide an interesting programming methodology for looping computation. On the other hand, Haskell can define cyclic data structures by recursive definitions. This paper shows that there exists a common principle underlying both cyclic data and cyclic computations of arrow programs. We examine three concrete examples of constructing looping arrows from a syntact...
An Approach to Constructing Nested Space-Filling Designs for Multi-Fidelity Computer Experiments.
Multi-fidelity computer experiments are widely used in many engineering and scientific fields. Nested space-filling designs (NSFDs) are suitable for conducting such experiments. Two classes of NSFDs are currently available. One class is based on special orthogonal arrays of strength two and the other consists of nested Latin hypercube designs. Both of them assume all factors are continuous. We ...
An Eager Regression Method Based on Selecting Appropriate Features
This paper describes a machine learning method called Regression by Selecting Best Features (RSBF). RSBF consists of two phases: The first phase aims to find the predictive power of each feature by constructing simple linear regression lines, one per continuous feature and one per category of each categorical feature. Although the predictive power of a continuous feature is const...
Distinguishing Between Latent Classes and Continuous Factors: Resolution by Maximum Likelihood?
Latent variable models exist with continuous, categorical, or both types of latent variables. The role of latent variables is to account for systematic patterns in the observed responses. This article has two goals: (a) to establish whether, based on observed responses, it can be decided that an underlying latent variable is continuous or categorical, and (b) to quantify the effect of sample si...
Bayesian inference for square contingency tables
Inference for multivariate categorical data often proceeds by selecting a log-linear model from a set of competing models or, in a Bayesian approach, by averaging inferences over the set, weighted by posterior probabilities. In this paper, we use permutation invariance as a criterion for constructing a set of models for this purpose, for the common situation when the data form a ‘square’ contin...
Publication date: 2012